home
***
CD-ROM
|
disk
|
FTP
|
other
***
search
/
Pascal Super Library
/
Pascal Super Library (CW International)(1997).bin
/
TABLES
/
HTABLE
/
GENTABLE.DOC
< prev
next >
Wrap
Text File
|
1989-12-15
|
9KB
|
189 lines
(************************************************************************)
Fully Generic, Fully Dynamic Hash Tables
Author : Eric C. Wentz
Phone : 1-703-860-3807 (Before 11 P.M. Eastern)
C_Serve : 72070,1015
UNIT REQUIREMENTS: Requires Unit GENDATUM
HTable.Arc Includes version 1.1
of Unit GenDatum -- previously
uploaded in GENERI.ARC -- soon to
be uploaded in GENER2.ARC.
(GenTable will work just fine
with either version of GenDatum)
Also requires the all-pervasive
Unit FLEXPNTR -- found in unit
GENERI.ARC (GenDatum needs it)
You're right -- shameless of me
to force you to download another
of my uploads!
For all you "OverLay Cowboys" out there, GenTable IS
compiled in the O+ state.
(************************************************************************)
{ Unit GenTable written by Eric C. Wentz -- last mod Date: 12/15/89 }
{ Implements a String-keyed Hash Table of Generic Datum Objects }
{ See Unit GenDatum }
{ The Hash Tables defined herein use the External Chaining solution for }
{ hashing. The Hashing Function used IS case-sensitive. }
{ Data to be entered into the Hash Table is loaded into a Generic Object }
{ (EntryRec) and then the EntryRec is Inserted/Appended into the Hash }
{ Table location determined by the Key string associated with the entry. }
{ Retrieval is accomplished by a similar, albeit reversed, process -- }
{ with the obvious exception that the entry is not deleted. Keys may be }
{ examined for Membership in the Hash Table, or a boolean Found may be }
{ returned by Retrieve. Duplicate Keys will not be inserted, and are }
{ flagged by Enter's boolean variable Duplicate. }
(**** Simulated Interface -- Private declarations removed ****
Type
KeyString = String[20]; { Maximum Key Length }
HTable = Object
Constructor Create (LoadFactor,DataSize : Word;
MaxEntries : LongInt);
{ CREATE: LoadFactor represents the Number of entries the }
{ caller would LIKE to see in each "Bucket" of the final }
{ Hash Table. Probably only achievable if you provide a }
{ PERFECT Hashing function for your Keys and Table. }
{ DataSize is the SizeOf the Elements you wish to Hash. }
{ MaxEntries is the Max Number of Table Entries you expect }
{ to need -- It does NOT actually limit the size of the }
{ Table! The Table uses MaxEntries, DataSize and LoadFactor }
{ to determine the number of "Buckets" to allocate. HTables }
{ can grow to the point of using all available Heap Space }
{ (althought this is VERY unlikely due to hidden factors), }
{ but the more "Buckets there are the better performance you }
{ will get. If Memory is no Object (oops!) (sorry), simply }
{ use 0 for both LoadFactor and MaxEntries. This will cause }
{ the Table to use as many "Buckets" as it can cram into RAM }
Destructor Destroy; Virtual;
{ DESTROY: This routine returns ALL memory used by the Table }
{ to the Heap. For Large Tables, Destroy can take a VERY }
{ long time. Fear Not, Destroy Can't lock-up your machine, }
{ even though you may well begin to think it has!!! }
Procedure Enter (TheKey : String; Var D; D_Size : Word;
Var Duplicate : Boolean);
{ ENTER: TheKey is the Key to be associated with the data in }
{ the Typeless parameter D. D_Size is the SizeOf D. }
{ Duplicate is the indication that a duplicate key has been }
{ encountered. Duplicate Keys cause the table to refuse to }
{ accept the data. }
Procedure Retrieve (TheKey : String; Var D; D_Size : Word;
Var Found : Boolean);
{ RETRIEVE: TheKey is the Key to be searched for. If NOT }
{ Found, D will contain whatever it initially had, else it }
{ will contain the data previously associated with TheKey. }
{ D_Size is the size of that data element. }
Procedure UpDate (TheKey : String; Var D; D_Size : Word);
{ UPDATE: Permits the Data associated with Key to be updated }
{ to that contained in D. }
Function Hash (TheString : String) : Word; Virtual;
{ HASH: MUST return a number in range 0..StaticLength-1 }
Function Member (TheKey : String) : Boolean;
{ MEMBER: Is Key in the Table? }
Function Empty : Boolean;
{ EMPTY: Is the Table Empty? }
{ The Following routines are provided to aid in more effective Hash function }
{ design, as well as provide statistics about the utilization of a given }
{ Table - to aid in choosing LoadFactor and MaxEntries for final application }
{ programs. They provide little that is useful beyond the development stage.}
Function StaticLength : Word;
{ STATICLENGTH: The number of "Buckets" in this Table }
Function EntryCount : Word;
{ ENTRYCOUNT: How many entries are in the Table? }
Function MaxLoad : Byte;
{ MAXLOAD: How many entries are in the "largest Bucket" ? }
Function Buckets_In_Use : Word;
{ BUCKETS_IN_USE: How many Buckets are ACTUALLY being used? }
Function AverageLoad : Real;
{ AVERAGELOAD: What is the Average Load for those "Buckets" }
{ which are currently being used? }
Function LastBucket : Word;
{ LASTBUCKET: What is the index of the last active "Bucket" ? }
Procedure Distribution_Report;
{ DISTRIBUTION_REPORT: Print to screen certain statistics }
{ regarding "Bucket" utilization. }
End;
**** End of simulated Interface ****)
The HTable Object can be visualized as a FlexArray (see GENERI.ARC) of
pointers to linked lists of records. Each pointer represents a "Bucket",
and each list represents the contents of that bucket. Internally, while
similar to the FlexArray, FlexArrays are not actually used (for code
optimization) so none of the familiar FlexArray features are available, such
as ReSize. Thus, once Created, each HTable has a certain fixed number of
buckets associated with it. This fixed number is either determined by the
user (in the call to Create) as MaxEntries Div LoadFactor, or will be
determined by Create if the user opts for 0,0. If the user selects 0,0
Create will allocate as many buckets as available memory will permit
(using the size of the internal record as size criterion). For this
reason, I do not recommend the 0,0 option except in the most unusual
circumstances. Remember, MaxEntries does not limit the entrie the table
can accept - only available memory does that - MaxEntries is simply a
device for attempting to achieve the desired LoadFactor, which of course
directly relates to HTable retrieval performance. KeyStrings are internally
limited to 20 characters simply to keep the size of these internal records
reasonable -- in practice String[255] worked just as well, but caused the
tables to gobble huge hunks of memory. Since there may be some good use
for such huge Hash keys, I have included Unit GenTabl2 which is EXACTLY
the same as GenTable, but compiled with the Maximum Key length of 255.
BE WARNED: GenTabl2 is a MEMORY HOG!
The Source code included with HTABLE.ARC is granted to the public domain.
The source code for GenDatum and GenTable (and GenTable2) is neither
included, nor granted to the public domain, but the use of the compiled
units most certainly is!
Comments, suggestions, complaints and questions should be sent to
my compuserve account (listed above), or feel free to call me
(at a reasonable hour, please!). I am especially interested in any
bugs which I may have overlooked. While I have tested these units as
much as I can conceive, I have found that testing is an art form in and
of itself, and I have not yet mastered its every little intricacy.
Enjoy!